Vertex AI, Google Cloud’s fully managed, unified AI development platform, provides a comprehensive solution for leveraging models at scale, customizing models with enterprise-ready tuning, grounding, monitoring, and deployment capabilities, and building AI agents.
With more than 150 first-party, open-source, and third-party foundation models, Vertex AI accelerates innovation by giving customers such as ADT, IHG Hotels & Resorts, ING Bank, and Verizon a one-stop platform for building, deploying, and maintaining AI applications and agents.
Key updates for Vertex AI at Google I/O 2024
At Google I/O 2024, Google announced a series of Vertex AI updates, led by new models developed by Google DeepMind and other teams at Google and made available to Google Cloud customers:
Available Now:
Gemini 1.5 Flash: Available in public preview, Gemini 1.5 Flash delivers a groundbreaking context window of 1 million tokens in a model that is lighter-weight than 1.5 Pro and designed to serve high-volume tasks such as chat applications with speed and scale (a short usage sketch follows this list).
PaliGemma: Available in the Vertex AI Model Garden, PaliGemma is the first vision-language model in the Gemma family of open models, well suited to tasks such as image captioning and visual question answering.
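As a rough illustration of calling Gemini 1.5 Flash through the Vertex AI Python SDK (google-cloud-aiplatform), here is a minimal sketch. The project ID, region, and the model version string "gemini-1.5-flash-001" are placeholder assumptions; check the Model Garden for the identifier available to your project.

```python
import vertexai
from vertexai.generative_models import GenerativeModel

# Placeholder project and region; use any region where Gemini 1.5 Flash is available.
vertexai.init(project="your-project-id", location="us-central1")

# Model ID assumed from the public preview; newer versions may supersede it.
model = GenerativeModel("gemini-1.5-flash-001")

response = model.generate_content(
    "Draft a friendly two-sentence welcome message for a hotel concierge chatbot."
)
print(response.text)
```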
Coming Soon:
Imagen 3: Google’s highest-quality text-to-image generation model to date, capable of rendering incredible levels of detail and producing photorealistic, lifelike images.
Gemma 2: The next generation in the Gemma family of open models designed for a broad range of AI developer use cases, built with the same research and technology used to create Gemini.
New features in Vertex AI
Google also announced new features, including context caching, controlled generation, and a batch API, to help customers optimize model performance and cost. Brief usage sketches follow the feature descriptions below.
Context Caching: Context caching lets customers actively manage and reuse cached context data across requests, helping to significantly reduce costs when large prompts are sent repeatedly (see the first sketch below).
Controlled Generation: Controlled generation lets customers define Gemini model outputs according to specific formats or schemas, guaranteeing the format and syntax of model responses (see the second sketch below).
Batch API: A highly efficient way to send large numbers of latency-insensitive text prompt requests for use cases such as classification, sentiment analysis, data extraction, and annotation, the batch API helps speed up developer workflows and reduce costs (see the third sketch below).
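A minimal sketch of context caching, assuming the preview `vertexai.preview.caching` module of the Python SDK; the bucket path, model version, and TTL are placeholders, and the exact module path may differ between SDK releases.

```python
import datetime

import vertexai
from vertexai.preview import caching
from vertexai.preview.generative_models import GenerativeModel, Part

vertexai.init(project="your-project-id", location="us-central1")

# Cache a large document once so later prompts can reuse it without resending those tokens.
cached_content = caching.CachedContent.create(
    model_name="gemini-1.5-pro-001",
    system_instruction="You are an analyst answering questions about the attached report.",
    contents=[Part.from_uri("gs://your-bucket/annual-report.pdf", mime_type="application/pdf")],
    ttl=datetime.timedelta(hours=1),
)

# Subsequent requests reference the cached context instead of resending it.
model = GenerativeModel.from_cached_content(cached_content=cached_content)
print(model.generate_content("Summarize the key findings in three bullet points.").text)
```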
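For controlled generation, the sketch below constrains Gemini output to a JSON schema via the `response_mime_type` and `response_schema` fields of the generation config; the schema, project, and model version are illustrative assumptions.

```python
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")

# Illustrative schema: force the model to return a sentiment label and a short summary.
response_schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "summary": {"type": "string"},
    },
    "required": ["sentiment", "summary"],
}

model = GenerativeModel("gemini-1.5-pro-001")
response = model.generate_content(
    "Review: The room was spotless and the front desk staff were wonderful.",
    generation_config=GenerationConfig(
        response_mime_type="application/json",
        response_schema=response_schema,
    ),
)
print(response.text)  # A JSON string conforming to the schema above
```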
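And for the batch API, a sketch of submitting a batch job over a JSONL file in Cloud Storage, assuming the `vertexai.batch_prediction` module found in recent SDK releases (earlier releases expose it under `vertexai.preview`); the bucket URIs and model version are placeholders.

```python
import time

import vertexai
from vertexai.batch_prediction import BatchPredictionJob

vertexai.init(project="your-project-id", location="us-central1")

# input.jsonl holds one request per line, e.g.
# {"request": {"contents": [{"role": "user", "parts": [{"text": "Classify: ..."}]}]}}
job = BatchPredictionJob.submit(
    source_model="gemini-1.5-flash-001",
    input_dataset="gs://your-bucket/batch/input.jsonl",
    output_uri_prefix="gs://your-bucket/batch/output/",
)

# Poll until the asynchronous job finishes, then read results from the output location.
while not job.has_ended:
    time.sleep(30)
    job.refresh()
print(job.state, job.output_location)
```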
Agent Builder: New open source integrations
Vertex AI Agent Builder lets developers build and deploy AI experiences through a range of tools, from a no-code console for building AI agents with natural language to code-first, open-source orchestration frameworks such as LangChain. To further power Agent Builder, Google has made Firebase Genkit and LlamaIndex available on Vertex AI.
Firebase Genkit: Announced by the Firebase team, Genkit is an open-source TypeScript/JavaScript framework designed to simplify the development, deployment, and monitoring of production-ready AI agents.
LlamaIndex on Vertex AI: Simplifying the retrieval-augmented generation (RAG) process, from data ingestion and transformation to embedding, indexing, retrieval, and generation, LlamaIndex offers a simple, flexible, open-source data framework for connecting custom data sources to generative models.
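As a rough sketch of the underlying open-source LlamaIndex pattern (not the managed LlamaIndex on Vertex AI API itself), the example below indexes a local folder and answers questions with a Gemini model. The `llama-index-llms-vertex` and `llama-index-embeddings-vertex` integration packages, class names, and model identifiers are assumptions and may differ across LlamaIndex versions.

```python
# pip install llama-index llama-index-llms-vertex llama-index-embeddings-vertex
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.vertex import VertexTextEmbedding
from llama_index.llms.vertex import Vertex

PROJECT_ID = "your-project-id"  # placeholder

# Route both generation and embeddings through Vertex AI models (assumed integrations).
Settings.llm = Vertex(model="gemini-1.5-flash-001", project=PROJECT_ID)
Settings.embed_model = VertexTextEmbedding(
    model_name="text-embedding-004", project=PROJECT_ID, location="us-central1"
)

# Ingest, index, retrieve, and generate: a minimal end-to-end RAG loop over local files.
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)
answer = index.as_query_engine().query("What does our refund policy say about late cancellations?")
print(answer)
```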
Grounding with Google Search
In addition to helping customers ground model outputs in their own private data or designated “enterprise truth” sources, Google announced that Grounding with Google Search is now generally available. Google also expanded its generative AI indemnification to cover outputs grounded in Google Search results.
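A minimal sketch of Grounding with Google Search through the Vertex AI Python SDK, assuming the `grounding.GoogleSearchRetrieval` tool exposed in `vertexai.generative_models`; the project, region, and model version are placeholders.

```python
import vertexai
from vertexai.generative_models import GenerativeModel, Tool, grounding

vertexai.init(project="your-project-id", location="us-central1")

# Tool that lets Gemini ground its answer in Google Search results.
search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

model = GenerativeModel("gemini-1.5-flash-001")
response = model.generate_content(
    "What Vertex AI updates were announced at Google I/O 2024?",
    tools=[search_tool],
)

print(response.text)
# Sources and search queries used for grounding are attached to the candidate.
print(response.candidates[0].grounding_metadata)
```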
With Vertex AI, Google aims to democratize AI innovation and help organizations accelerate their AI deployments into production.